Attorney's Docket No. 9052-111 PATENT

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re: Antson et al. Examiner: To be assigned 

Serial No.: PCT/GB00/03568 Group Art Unit: To be assigned

Filed: September 18, 2000 

For: Target for Antiviral Therapy 

Date: March 15, 2002

Box PCT 

U.S. Patent and Trademark Office 
P.O. Box 2327 
Arlington, VA 22202 
Attn: DO/EO/US 

PRELIMINARY AMENDMENT 

Sir: 

Prior to the examination of the above application, please amend the above 
referenced application as follows. Please enter the following amendments prior to the 
calculation of the filing fee. Pursuant to the rules for amendments under 37 C.F.R. 
§1.121, the claims have been amended herein using the rewritten claims format. The 
present amendment also includes a section entitled "VERSION WITH MARKINGS 
TO SHOW CHANGES MADE" attached hereto. 



IN THE SPECIFICATION: 

Please amend the specification as follows: 

On page 1 after the title of the invention please add: 



RELATED APPLICATIONS 

This application claims priority under 35 U.S.C. § 371 from
PCT/GB00/03568 (published under PCT Article 21(2) in English), filed on
September 18, 2000, which claims priority to Great Britain Application Serial No.
9921938.8, filed on September 17, 1999, the disclosures of which are incorporated by
reference herein in their entireties.

IN THE CLAIMS:

Please enter the amended claims as follows: 

2. (Amended) The E2NT dimer protein of Claim 1 wherein the residues lie on 
opposite sides of an N-terminal domain. 

3. (Amended) The E2NT dimer protein of Claim 1, wherein the residues 
comprise a plurality of residue clusters associated with a structural role at an interface 
between N1 and N2 terminal domains of respective monomers within the dimer.

4. (Amended) The E2NT dimer protein of Claim 3 comprising three clusters. 

5. (Amended) The E2NT dimer protein of Claim 3 wherein a first cluster of vital
residues is associated with interactions between N1 and N2 domains and comprises
any one or more of the following residues: Ile82, Glu90, Trp92, Lys112, Tyr138,
Val145.

6. (Amended) The E2NT dimer protein of Claim 3, wherein a second cluster of 
residues is associated with N1 interactions and comprises either or both of residues
Trp33 and Leu94. 

7. (Amended) The E2NT dimer protein of Claim 3, wherein a third cluster of 
residues is associated with N2 interactions and comprises any one or more of the 
following residues: Pro106, Lys111, Phe168, Trp134.

8. (Amended) The E2NT dimer protein of Claim 1, further comprising residues 
associated with transactivation and/or replication properties of E2. 

9. (Amended) The E2NT dimer of Claim 8, wherein the residues comprise any 
one or more of the following residues: Glu20, Glu100, Asp122, Arg37, Glu39, Ile73,
Gln12 and Ala69.

10. (Amended) A method for determining the structure of a crystallised molecular
complex of an E2 N-terminal module (E2NT) dimer protein, wherein the E2NT dimer 
protein and any of its mutations are mapped onto an E2 three-dimensional structure so 
as to identify areas of amino acid conservation and the effect of mutations on folding 
of the E2 protein. 

11. (Amended) The method according to Claim 10 in rationalised antiviral drug 
design. 

12. (Amended) A method for identifying and/or selecting a candidate therapeutic 
agent, the method comprising: 

determining interaction of an E2 N-terminal module (E2NT) dimer in a sample
by contacting said sample with said candidate therapeutic agent and measuring DNA 
loop formation in E2 in vitro. 

13. (Amended) The method according to Claim 12 further comprising identifying 
and/or selecting an antiviral candidate therapeutic agent.

14. (Amended) The method according to Claim 13, wherein the identifying and/or 
selecting of the antiviral candidate therapeutic agent depends on its ability to interfere 
with or block interactions of E2NT so as to interfere or block viral and/or cellular 
transcription factors. 

15. (Amended) A method of treating an HPV infection in a subject comprising: 
introducing an E2NT dimerisation inhibitor in said subject. 

16. (Amended) The method according to Claim 15 further comprising treating 
warts, proliferative skin lesions and/or cervical cancer. 

18. (Amended) Use of a dimerisation surface of a crystallised molecular complex
of an E2 N-terminal module (E2NT) dimer protein or homologue thereof according to

Please cancel claim 33 without prejudice or disclaimer.
Please add the following new claims: 

34. (New) A method for designing a potential antiviral compound for the prevention 
or treatment of an HPV infection, comprising: 

a) obtaining crystals of the E2NT dimer protein such that the three dimensional 
structure of the crystallized E2NT dimer protein can be determined to a resolution of 
about 1.9 Å or better,

b) evaluating the three dimensional structure of the crystallized E2NT dimer protein; 

c) synthesizing the potential antiviral compound based on the three-dimensional 
crystal structure of the crystallized E2NT dimer protein; 

d) contacting an HPV virus with the potential antiviral compound; and 

e) assaying the HPV virus for infectivity or monitoring the virus for activity, or both,
whereby a decrease in the infectivity of the virus or a change in the activity of the 
virus indicates the compound may be used for the prevention or treatment of an HPV 
infection. 

35. (New) The method according to claim 34, wherein the antiviral compound is a 
peptide or polypeptide. 

36. (New) The method according to claim 34, wherein the E2 N-terminal module 
dimer protein or homologue thereof comprises residues vital for transcriptional and 
replication activities of said protein. 

37. (New) The method according to claim 36, wherein the residues comprise a
plurality of residue clusters associated with a structural role at an interface between
N1 and N2 terminal domains of respective monomers within the dimer.

38. (New) The method according to claim 37, wherein said E2NT dimer protein 
comprises three clusters. 

39. (New) A method for designing a candidate compound for screening for binding to 
or inhibition of an HPV infection, comprising: 

a) utilizing the three dimensional structure of a crystallized E2NT module dimer 
protein wherein the residues comprise a plurality of residue clusters associated with a 
structural role at an interface between N1 and N2 terminal domains of respective
monomers within the dimer; and 

b) designing a candidate binding compound based on the three-dimensional crystal 
structure of the crystallized E2NT dimer protein for binding to said dimer protein. 

40. (New) The method of claim 39, wherein the candidate compound is a peptide or 
polypeptide. 

41. (New) A method for determining the crystallised molecular complex of an E2
N-terminal module (E2NT) dimer protein, wherein the E2NT dimer protein and any of
its mutations are mapped onto an E2 three-dimensional structure so as to identify areas
of amino acid conservation and the effect of mutations on folding of the E2 protein.

REMARKS

Please note that the claims pending at the time of this filing are the claims of
the international application serial no. PCT/GB00/03568, i.e. claims 1-33. Claim 33
has been cancelled and claims 34-41 have been added. The pending claims have been
amended above to better conform to United States practice. The marked-up version
of the changes to the specification and claims is attached hereto in the "Version With
Markings to Show Changes Made".

It is respectfully submitted that this application is now in condition for
substantive examination, which action is respectfully requested. 

Respectfully submitted, 

Jarett K. Abramson
Attorney for Applicants
Registration No. 47,376 




Enc: Version With Markings to Show Changes Made 
CUSTOMER NO. 

VERSION WITH MARKINGS TO SHOW CHANGES MADE
IN THE CLAIMS: 

The claims have been amended as follows: 

2. (Amended) [An] The E2NT dimer protein [according to] of Claim 1 wherein 
the residues lie on opposite sides of an N-terminal domain. 

3. (Amended) [An] The E2NT dimer protein [according to either preceding
claim] of Claim 1, wherein the residues comprise a plurality of residue clusters
associated with a structural role at an interface between N1 and N2 terminal domains
of respective monomers within the dimer.

4. (Amended) [An] The E2NT dimer protein [according to] of Claim 3 
comprising three clusters. 

5. (Amended) [An] The E2NT dimer protein [according to either of Claims] of
Claim 3 [or 4] wherein a first cluster of vital residues is associated with interactions
between N1 and N2 domains and comprises any one or more of the following
residues: Ile82, Glu90, Trp92, Lys112, Tyr138, Val145.

6. (Amended) [An] The E2NT dimer protein [according to any one of Claims 3-5]
of Claim 3, wherein a second cluster of residues is associated with N1 interactions
and comprises either or both of residues Trp33 and Leu94.

7. (Amended) [An] The E2NT dimer protein [according to any one of Claims 3-6]
of Claim 3, wherein a third cluster of residues is associated with N2 interactions
and comprises any one or more of the following residues: Pro106, Lys111, Phe168,
Trp134.

8. (Amended) [An] The E2NT dimer protein [according to any preceding claim] 
of Claim 1, further comprising residues associated with transactivation and/or 
replication properties of E2. 

9. (Amended) [An] The E2NT dimer [according to] of Claim 8, wherein the
residues comprise any one or more of the following residues: Glu20, Glu100, Asp122,
Arg37, Glu39, Ile73, Gln12 and Ala69.

10. (Amended) [Use of a] A method for determining the structure of a crystallised 
molecular complex of an E2 N-terminal module (E2NT) dimer protein, [according to 
any preceding claim or homologue thereof in mapping mutations] wherein the E2NT 
dimer protein and any of its mutations are mapped onto an E2 three-dimensional 
structure so as to identify areas of amino acid conservation and the effect of mutations
on folding of the E2 protein. 

11. (Amended) [Use] The method according to Claim 10 in rationalised antiviral 
drug design. 

12. (Amended) [An in vitro] A method for identifying and/or selecting a candidate
therapeutic agent, the method comprising:

determining interaction of an E2 N-terminal module (E2NT) dimer in a sample
by contacting said sample with said candidate therapeutic agent and measuring DNA
loop formation in E2 in vitro.

13. (Amended) [Use of the] The method according to Claim 12 [in] further 
comprising identifying and/or selecting an antiviral candidate therapeutic agent. 

14. (Amended) [Use according to] The method according to Claim 13, wherein 
[identification/selection] the identifying and/or selecting of the antiviral candidate 
therapeutic agent depends on its ability to interfere with or block interactions of E2NT 
so as to interfere or block viral and/or cellular transcription factors. 

15. (Amended) [Use of an E2NT dimerisation inhibitor for the preparation of a 
medicament for treatment of conditions that arise as a result of] A method of treating 
an HPV infection in a subject comprising: 




introducing an E2NT dimerisation inhibitor in said subject. 

16. (Amended) [Use] The method according to Claim 15 [for the treatment of] 
further comprising treating warts, proliferative skin lesions and/or cervical cancer. 

18. (Amended) Use of a dimerisation surface of a crystallised molecular complex
of an E2 N-terminal module (E2NT) dimer protein or homologue thereof according to 
[any one of Claims 1-9] Claim 1 as a target site for interaction with putative antiviral 
agents and/or for measuring efficacy of said agents. 

20. (Amended) [A] The method of claim 19, wherein the method by which the
E2NT crystal structure is obtainable comprises crystallisation using hanging-drop 
vapour diffusion. 

21. (Amended) [A] The method of claim 19, [or claim 20] wherein the method by
which E2NT crystal structure is obtainable comprises X-ray diffraction using uranium
acetate and gold cyanide E2NT derivatives and refining with data extending to 1.9 Å
spacing.

22. (Amended) [A] The method of [any of claims] Claim 19 [to 21], wherein the 
crystal structure comprises the portions of amino acids Ile82, Glu90, Trp92, Lys112,
Tyr138, Val145, Pro106, Lys111, Phe168, Trp134, Trp33 and Leu94.

23. (Amended) [A] The method of [any of claims] Claim 19 [to 22], wherein the 
rationalised drug design comprises designing drugs which interact with the 
dimerisation surface of E2NT. 

32. (Amended) A method for evaluating the ability of a chemical entity to
associate with a molecule or molecular complex [according to claim 27 or claim 28]
comprising a dimerisation surface defined by structure coordinates of E2NT amino
acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145, Pro106, Lys111, Phe168,
Trp134, Trp33 and Leu94 according to Table 3 or a homologue of said molecule or
molecular complex, wherein said homologue comprises a binding pocket that has a
root mean square deviation from the backbone atoms of said amino acids of not more
than 1.5 Å, comprising the steps of:



a. employing computational means to perform a fitting operation between the 
chemical entity and a dimerisation surface of the molecule or molecular complex; and 

b. analysing the results of said fitting operation to quantify the association 
between the chemical entity and the dimerisation surface. 



Claim 33 has been canceled without prejudice or disclaimer. 

Target for Antiviral Therapy



The present invention provides a crystallised module of a nuclear phosphoprotein and 
an assay and method for determining interactions with human papillomavirus E2 for 
use in drug design, for use particularly but not exclusively in designing antiviral
agents with potential use in treating warts, proliferative skin lesions and carcinoma of 
the cervix. 

Background to the Invention



Human papillomaviruses (HPVs) cause warts and proliferative lesions in skin and
other epithelia. In a minority of HPV types ("high risk", which include HPVs 16, 18,
31, 33, 45 and 56), further transformation of the wart lesions can produce tumours,
most notably carcinoma of the cervix. HPVs have evolved a sophisticated system of
control, mediated by protein:DNA and protein:protein interactions, that involves both
cellular and viral proteins. The 45 kDalton nuclear phosphoprotein, E2, has two
central roles in this control. It acts as the principal virally encoded transcription
factor and, in association with the viral E1 protein, it creates the molecular complex
at the origin of the viral DNA replication.

E2 has three distinct modules. The N-terminal module (E2NT) of about 200 amino
acids is responsible for interactions with viral and host cell transcription factors. It is
followed by a flexible, proline-rich, linker module and a C-terminal module (E2CT),
each of about 100 amino acids (Fig. 1a). The E2CT binds as a homodimer to DNA
sites with a consensus sequence of ACCGN4CGGT. In most HPVs a long upstream
regulatory region (URR) precedes the viral genes and contains four spatially
conserved E2 binding sites: three sites proximal to the transcription start site (p97 in
HPV 16) and one approximately 500 bp upstream.
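
By way of illustration only, the consensus sequence quoted above can be searched for in a DNA string with a short Python sketch. It reads N4 as any four bases; the function name `find_e2_sites` and the example sequence are hypothetical and are not taken from any HPV genome.

```python
import re

# Consensus E2 binding site from the text: ACCG, any four bases, then CGGT.
# The lookahead makes the scan report sites that overlap one another.
E2_SITE = re.compile(r"(?=(ACCG[ACGT]{4}CGGT))")

def find_e2_sites(dna):
    """Return (start, site) pairs for every match, using 0-based positions."""
    seq = dna.upper()
    return [(m.start(), m.group(1)) for m in E2_SITE.finditer(seq)]

# Hypothetical sequence for illustration only (not an HPV sequence).
example = "TTACCGAAAACGGTGGGACCGTTTTCGGTAA"
print(find_e2_sites(example))
```

Because the reverse complement of ACCG is CGGT, the consensus reads the same on both strands, so a single-strand scan of this kind is sufficient for the pattern as stated.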






The dimer of E2CT serves to anchor E2 protein to its recognition sites on the DNA;
the function of the E2NT is to bind and localise at least three cellular transcription
factors, Sp1, TFIIB and AMF-1, to the transcription initiation complex. In addition,
E2 interacts with another viral protein, E1, which has ATPase and helicase activities.
E1 itself binds to the viral origin of replication, which consists of about 100 bp and is
surrounded by the three E2-binding sites proximal to the transcription start. The
E2:E1 interaction greatly increases the rate of HPV genome replication (Fig. 1a).
An intact E2 is essential for the normal productive (wart) life cycle of HPV; however,
during malignant progression HPV DNA is integrated into the host cell genome,
which usually results in disruption of the E2/E1 ORFs and loss of E2 protein, in turn
leading to dysregulated expression of the viral oncogenes E6 and E7.

Consistent with its role as a transcription regulator, E2 has been shown to direct the
formation of loops in DNA containing E2 binding sites. The loops were only
formed with intact E2, and not with the E2CT alone. The E2 binding sites did not
function independently and their co-operative effect was mediated by full length E2,
leading the authors to suggest that there were specific interactions mediated by E2
that bridged across the set of DNA binding sites through its N-terminal. A similar
DNA loop structure could also be achieved with Sp1, a cellular transcription factor,
which forms a complex with distally bound E2; Sp1/E2 interactions are critical for
transcription activation in BPV.

Eighty-six known E2 proteins from different species and different human subtypes
are highly conserved, with sequence identities typically of 35% in the N- and
C-terminal modules (Fig. 1b). The crystal structure of the E2CT has been determined
both alone and in complex with cognate DNA. The module is a dimer with a
barrel fold, and induces substantial bending (42-44°) of the DNA from its B-form
double helix.

The structure of the proteolytic fragment of HPV 18 E2NT, missing 65 N-terminal
residues, was recently reported at 2.1 Å spacing. This allowed some analysis of
mutational effects on function, although the missing 65 amino acids contain residues
which are essential for the transcriptional and replication activities of the protein.

We report herein the structure of the complete E2NT determined by X-ray analysis at
1.9 Å. We have found that it is an L-shaped molecule with the residues vital for
transcriptional and replication activities of the protein lying on opposite sides of the
N-terminal domain. Surprisingly, our results show that the surface, vital for
transcription activation, is in fact involved in association of two E2NTs into a dimer.
We suggest that dimerisation of E2NT plays a key role in induction
of DNA loop formation, the mechanism by which distally bound transcription factors
would be brought close to the site of transcription initiation. More importantly, our
results raise the possibility that dimer formation serves as a molecular switch
between early gene expression and viral genome replication during HPV infection.

The process of rationalised drug design requires no explanation or teaching for the
skilled person but a brief description is given here of computational design for the lay
reader: various computational analyses are necessary to determine whether a
molecule is sufficiently similar to the target moiety or structure. Such analyses may
be carried out in current software applications, such as the Molecular Similarity
application of QUANTA (Molecular Simulations Inc., Waltham, Mass.) version 3.3,
and as described in the accompanying User's Guide, Volume 3, pages 134-135.

The Molecular Similarity application permits comparisons between different
structures, different conformations of the same structure, and different parts of the
same structure. The procedure used in Molecular Similarity to compare structures is
divided into four steps: 1) load the structures to be compared; 2) define the atom
equivalences in these structures; 3) perform a fitting operation; and 4) analyze the
results.

Each structure is identified by a name. One structure is identified as the target (i.e.,
the fixed structure); all remaining structures are working structures (i.e., moving
structures). When a rigid fitting method is used, the working structure is translated
and rotated to obtain an optimum fit with the target structure. The fitting operation
uses a least squares fitting algorithm that computes the optimum translation and
rotation to be applied to the moving structure, such that the root mean square
difference of the fit over the specified pairs of equivalent atoms is an absolute
minimum. This number, given in angstroms, is reported by QUANTA.
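
For the lay reader, the rigid fitting operation described above can be sketched in a few lines of Python. This is not the QUANTA implementation, which is not reproduced here. It is a minimal sketch that assumes the atom equivalences are already defined by list order, and it uses Horn's unit-quaternion solution with a simple power iteration as one standard way of obtaining the least squares rotation. The function names and coordinates are hypothetical.

```python
import math

def centroid(points):
    """Mean x, y, z of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def fit_rmsd(target, moving, iterations=1000):
    """Rigidly superpose `moving` onto `target` (paired atoms, same order).

    Returns (fitted_coordinates, rmsd). Horn's quaternion method is an
    assumption of this sketch, not necessarily the algorithm QUANTA uses.
    """
    if len(target) != len(moving) or len(target) < 3:
        raise ValueError("need at least three paired atoms")
    ct, cm = centroid(target), centroid(moving)
    # Optimum translation: bring both centroids to the origin.
    b = [tuple(p[i] - ct[i] for i in range(3)) for p in target]
    a = [tuple(p[i] - cm[i] for i in range(3)) for p in moving]
    # Correlation sums s[i][j] = sum over atom pairs of a_i * b_j.
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # The optimum rotation is the eigenvector of n with the largest eigenvalue.
    # A shifted power iteration keeps the sketch free of dependencies.
    shift = max(sum(abs(v) for v in row) for row in n)
    if shift == 0.0:
        raise ValueError("degenerate coordinates")
    q = [1.0, 0.5, 0.25, 0.125]
    for _ in range(iterations):
        q = [sum(n[r][c] * q[c] for c in range(4)) + shift * q[r]
             for r in range(4)]
        norm = math.sqrt(sum(v * v for v in q))
        q = [v / norm for v in q]
    w, x, y, z = q
    rot = [
        [w*w + x*x - y*y - z*z, 2*(x*y - w*z), 2*(x*z + w*y)],
        [2*(x*y + w*z), w*w - x*x + y*y - z*z, 2*(y*z - w*x)],
        [2*(x*z - w*y), 2*(y*z + w*x), w*w - x*x - y*y + z*z],
    ]
    # Rotate the centred moving atoms, then place them on the target centroid.
    fitted = [tuple(sum(rot[i][j] * p[j] for j in range(3)) + ct[i]
                    for i in range(3)) for p in a]
    sq = sum((f[i] - t[i]) ** 2 for f, t in zip(fitted, target) for i in range(3))
    return fitted, math.sqrt(sq / len(target))

# Hypothetical coordinates in angstroms: the moving copy is the target
# rotated by 30 degrees about z and shifted, so the fit should be near-exact.
target = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 2.0, 0.0),
          (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)]
c, s_ = math.cos(math.radians(30)), math.sin(math.radians(30))
moving = [(c*x - s_*y + 4.0, s_*x + c*y - 2.0, z + 1.0) for x, y, z in target]
fitted, rmsd = fit_rmsd(target, moving)
print(f"RMSD after fitting: {rmsd:.6f} A")
```

The returned root mean square difference is the quantity that would be compared against a cut-off such as the 1.5 Å backbone deviation used elsewhere in this specification to define a homologue.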

One skilled in the art may use one of several methods to screen chemical entities or
fragments for their ability to associate with a target. Again, these methods require no
elucidation for the skilled person but are described here for the benefit of the
unskilled reader. The screening process may begin by visual inspection of the target
on the computer screen, generated from a machine-readable storage medium.
Selected fragments or chemical entities may then be positioned in a variety of
orientations, or docked, within that binding pocket as defined supra. Docking may be
accomplished using software such as Quanta and Sybyl, followed by energy
minimization and molecular dynamics with standard molecular mechanics force
fields, such as CHARMM and AMBER.

Specialized computer programs may also assist in the process of selecting fragments
or chemical entities. These include:

1. GRID (P. J. Goodford, "A Computational Procedure for Determining Energetically
Favorable Binding Sites on Biologically Important Macromolecules", J. Med. Chem.,
28, pp. 849-857 (1985)). GRID is available from Oxford University, Oxford, UK.

2. MCSS (A. Miranker et al., "Functionality Maps of Binding Sites: A Multiple Copy
Simultaneous Search Method", Proteins: Structure, Function and Genetics, 11, pp.
29-34 (1991)). MCSS is available from Molecular Simulations, Burlington, Mass.

3. AUTODOCK (D. S. Goodsell et al., "Automated Docking of Substrates to
Proteins by Simulated Annealing", Proteins: Structure, Function, and Genetics, 8, pp.
195-202 (1990)). AUTODOCK is available from Scripps Research Institute, La Jolla,
Calif.

4. DOCK (I. D. Kuntz et al., "A Geometric Approach to Macromolecule-Ligand
Interactions", J. Mol. Biol., 161, pp. 269-288 (1982)). DOCK is available from
University of California, San Francisco, Calif.

Once suitable chemical entities or fragments have been selected, they can be
assembled into a single compound or complex. Assembly may be preceded by visual
inspection of the relationship of the fragments to each other on the three-dimensional
image displayed on a computer screen in relation to the structure coordinates of
calcineurin. This would be followed by manual model building using software such
as Quanta or Sybyl.

Useful programs to aid one of skill in the art in connecting the individual chemical 
entities or fragments include: 

1. CAVEAT (P. A. Bartlett et al., "CAVEAT: A Program to Facilitate the
Structure-Derived Design of Biologically Active Molecules", in "Molecular Recognition in
Chemical and Biological Problems", Special Pub., Royal Chem. Soc., 78, pp. 182-196
(1989)). CAVEAT is available from the University of California, Berkeley,
Calif.

2. 3D Database systems such as MACCS-3D (MDL Information Systems, San
Leandro, Calif.). This area is reviewed in Y. C. Martin, "3D Database Searching in
Drug Design", J. Med. Chem., 35, pp. 2145-2154 (1992).

3. HOOK (available from Molecular Simulations, Burlington, Mass.).

As the skilled reader will already know, instead of proceeding to build a ligand for the
target in a step-wise fashion, one fragment or chemical entity at a time as described
above, inhibitory or other target-binding compounds may be designed as a whole or
de novo. These methods include:

1. LUDI (H.-J. Bohm, "The Computer Program LUDI: A New Method for the De
Novo Design of Enzyme Inhibitors", J. Comp. Aid. Molec. Design, 6, pp. 61-78
(1992)). LUDI is available from Biosym Technologies, San Diego, Calif.

2. LEGEND (Y. Nishibata et al., Tetrahedron, 47, p. 8985 (1991)). LEGEND is
available from Molecular Simulations, Burlington, Mass.

3. LeapFrog (available from Tripos Associates, St. Louis, Mo.).

Other molecular modelling techniques may also be employed. See, e.g., N. C. Cohen
et al., "Molecular Modeling Software and Methods for Medicinal Chemistry", J. Med.
Chem., 33, pp. 883-894 (1990). See also, M. A. Navia et al., "The Use of Structural
Information in Drug Design", Current Opinions in Structural Biology, 2, pp. 202-210
(1992).

Once a compound has been designed or selected by the above methods, the efficiency
with which that entity may bind to a target may be tested and optimized by
computational evaluation. For example, an effective ligand will preferably
demonstrate a relatively small difference in energy between its bound and free states
(i.e., a small deformation energy of binding). Thus, the most efficient ligands should
preferably be designed with a deformation energy of binding of not greater than about
10 kcal/mole, preferably, not greater than 7 kcal/mole. Ligands may interact with the
target in more than one conformation that is similar in overall binding energy. In
those cases, the deformation energy of binding is taken to be the difference between
the energy of the free entity and the average energy of the conformations observed
when the inhibitor binds to the protein.
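
The deformation energy criterion above amounts to simple arithmetic, sketched below in Python for illustration. The sign convention (average bound energy minus free energy), the function names, the category labels and the energy values are assumptions of this sketch; only the 10 and 7 kcal/mole limits come from the text.

```python
def deformation_energy(free_energy, bound_energies):
    """Average energy of the bound conformations minus the free-state energy.

    Energies are conformational energies of the ligand alone, in kcal/mole.
    """
    if not bound_energies:
        raise ValueError("at least one bound conformation is required")
    return sum(bound_energies) / len(bound_energies) - free_energy

def classify(deformation, limit=10.0, preferred_limit=7.0):
    """Apply the limits quoted in the text; the labels are hypothetical."""
    if deformation <= preferred_limit:
        return "preferred"
    if deformation <= limit:
        return "acceptable"
    return "rejected"

# Hypothetical ligand: free-state energy -42.0, two bound conformations.
energy = deformation_energy(-42.0, [-36.0, -38.0])
print(energy, classify(energy))
```

Averaging over the bound conformations follows the rule given above for ligands that bind in more than one conformation; with a single bound conformation the average reduces to that conformation's energy.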

An entity designed or selected as binding to a target may be further computationally
optimized so that in its bound state it would preferably lack repulsive electrostatic
interaction with the target enzyme. Such non-complementary (e.g., electrostatic)
interactions include repulsive charge-charge, dipole-dipole and charge-dipole
interactions. Specifically, the sum of all electrostatic interactions between the
inhibitor or other ligand and the target, when the inhibitor is bound to the target,
preferably makes a neutral or favourable contribution to the enthalpy of binding.
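
As a rough illustration of the electrostatic criterion above, the following Python sketch sums pairwise Coulomb terms between ligand and target atoms and checks the sign of the total. Treating atoms as point partial charges in a uniform dielectric, with no distance cut-off, is a simplification assumed here and is not how the programs named in this specification necessarily work. The function names, charges and coordinates are hypothetical.

```python
import math

# Coulomb constant in kcal*A/(mol*e^2): charges in units of e, distances in angstroms.
COULOMB_KCAL = 332.0637

def electrostatic_sum(ligand, target, dielectric=1.0):
    """Sum pairwise Coulomb energies (kcal/mole) between ligand and target atoms.

    Each atom is given as (partial_charge, (x, y, z)).
    """
    total = 0.0
    for q_l, pos_l in ligand:
        for q_t, pos_t in target:
            r = math.dist(pos_l, pos_t)
            if r == 0.0:
                raise ValueError("overlapping atoms")
            total += COULOMB_KCAL * q_l * q_t / (dielectric * r)
    return total

def is_neutral_or_favourable(energy, tolerance=1e-9):
    """True when the summed interaction is zero or attractive (negative)."""
    return energy <= tolerance

# Hypothetical two-atom ligand near a single negatively charged target atom.
ligand = [(+0.5, (0.0, 0.0, 0.0)), (-0.5, (1.2, 0.0, 0.0))]
target = [(-1.0, (0.0, 0.0, 3.0))]
energy = electrostatic_sum(ligand, target)
print(is_neutral_or_favourable(energy))
```

A negative total corresponds to a net attractive interaction, and a positive total indicates the kind of repulsive, non-complementary interaction that the optimisation described above seeks to avoid.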

Specific computer software is available in the art to evaluate compound deformation
energy and electrostatic interaction. Examples of programs designed for such uses
include: Gaussian 92, revision C [M. J. Frisch, Gaussian, Inc., Pittsburgh, Pa.
©1992]; AMBER, version 4.0 [P. A. Kollman, University of California at
San Francisco, ©1994]; QUANTA/CHARMM [Molecular Simulations,
Inc., Burlington, Mass. ©1994]; and Insight II/Discover [Biosym
Technologies Inc., San Diego, Calif. ©1994]. These programs may be
implemented, for instance, using a Silicon Graphics workstation, IRIS 4D/35 or IBM
RISC/6000 workstation model 550. Other hardware systems and software packages
will be known to those skilled in the art.

Once the ligand has been optimally selected or designed, as described above,
substitutions may then be made in some of its atoms or side groups in order to
improve or modify its binding properties. Generally, initial substitutions are
conservative, i.e., the replacement group will have approximately the same size,
shape, hydrophobicity and charge as the original group. It should, of course, be
understood that components known in the art to alter conformation should be
avoided. Such substituted chemical compounds may then be analyzed for efficiency
of fit to a calcineurin-like binding pocket by the same computer methods described in
detail, above. Again, all these facts are familiar to the skilled person.

Another approach is the computational screening of small molecule data bases for
chemical entities or compounds that can bind in whole, or in part, to a target. In this
screening, the quality of fit of such entities to the binding site may be judged either
by shape complementarity or by estimated interaction energy. E. C. Meng et al., J.
Comp. Chem., 13, pp. 505-524 (1992).

The computational analysis and design of molecules, as well as software and
computer systems therefor, are described in US Patent No 5,978,740 which is
included herein by reference, including specifically but not by way of limitation the
computer system diagram described with reference to and illustrated in Fig 3 thereof
as well as the data storage media diagram described with reference to and illustrated
in Figs 4 and 5 thereof.

Statement of the Invention 

According to a first aspect of the invention there is provided a crystallised molecular
complex of an E2 N-terminal module (E2NT) dimer protein or homologue thereof,
for use in rationalised drug design. We have found that the dimer comprises residues
vital for transcriptional and replicational activities of said protein lying on opposite
sides of an N-terminal domain, for use in rationalised drug design.

Preferably the E2NT dimer protein is substantially as depicted in any of Figures 2c
and/or 3a-d.

According to a second aspect of the invention there is provided an in vitro method for
identifying and/or selecting a candidate therapeutic agent, the method comprising
determining interaction of an E2 N-terminal module (E2NT) dimer in a sample by
contacting said sample with said candidate therapeutic agent and measuring DNA
loop formation.

Preferably, the method is for use in identifying and/or selecting an antiviral candidate
therapeutic agent.

Preferably, the candidate therapeutic agent interferes with or blocks interactions of E2NT
so as to interfere with or block viral and/or cellular transcription factors.

According to a third aspect of the invention there is provided use of an E2NT
dimerisation inhibitor in the preparation of a medicament for use in treating warts,
proliferative skin lesions and/or cervical cancer.

According to a fourth aspect of the invention there is provided a method of
monitoring the efficacy of an antiviral therapy in a patient receiving a medicament for
the treatment of warts, proliferative skin lesions and/or cervical cancer comprising
taking a sample from said patient and measuring E2NT interactions and/or DNA loop
formation.

Thus it will be appreciated that a patient can be monitored at the start of therapy to
test its effectiveness. Alternatively, a patient can be monitored once a therapy has
been established so as to monitor its efficacy with a view to altering a therapy if
found to be unsatisfactory.

The human papillomavirus E2 protein controls the primary transcription and
replication of the viral genome. Both activities are governed by a ~200 amino acid
N-terminal module (E2NT) which is connected to a DNA binding C-terminal module
by a flexible linker. The crystal structure of the E2NT module from high-risk type 16
human papillomavirus reveals an L-shaped molecule with two closely packed
domains, each with a novel fold. It forms a dimer in the crystal and in solution. The
dimer structure is important in the interactions of E2NT with viral and cellular
transcription factors and is the key to induction of DNA loops by E2. These loops
may serve to target distal DNA-binding transcription factors to the region proximal to
the start of transcription. The structure has implications for antiviral drug design and
cervical cancer therapy.

The invention includes a method for identifying and/or selecting a candidate
therapeutic agent, comprising applying rationalised drug design to a crystal structure
obtainable by crystallising E2NT, cryogenically freezing the crystals and generating
the crystal structure using X-ray diffraction. The method by which the E2NT crystal
structure is obtainable may comprise crystallisation using hanging-drop vapour
diffusion. The method by which the E2NT crystal structure is obtainable may comprise
X-ray diffraction using uranium acetate and gold cyanide E2NT derivatives and
refining with data extending to 1.9 Å spacing. The crystal structure may comprise
the portions of amino acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145, Pro106,
Lys111, Phe168, Trp134, Trp33 and Leu94. The rationalised drug design may
comprise designing drugs which interact with the dimerisation surface of E2NT.

Further provided is a computer for producing a three-dimensional representation of a
molecule or molecular complex, wherein said molecule or molecular complex
comprises a dimerisation surface defined by structure coordinates of E2NT amino
acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145, Pro106, Lys111, Phe168, Trp134,
Trp33 and Leu94 according to Table 3, or a three-dimensional representation of a
homologue of said molecule or molecular complex, wherein said homologue
comprises a binding pocket that has a root mean square deviation from the backbone
atoms of said amino acids of not more than 1.5 Å, wherein said computer comprises:

(a) a machine-readable data storage medium comprising a data storage material
encoded with machine-readable data, wherein said data comprises the structure
coordinates of E2NT amino acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145,
Pro106, Lys111, Phe168, Trp134, Trp33 and Leu94 according to Table 3;

(b) a working memory for storing instructions for processing said machine-readable 
data; 

(c) a central-processing unit coupled to said working memory and to said
machine-readable data storage medium for processing said machine-readable data
into said three-dimensional representation; and

(d) a display coupled to said central-processing unit for displaying said three- 
dimensional representation. 

In one class of embodiments, the three-dimensional representation is of a molecule or
molecular complex defined by the set of structure coordinates according to Table 3,
or said three-dimensional representation is of a homologue of said molecule or
molecular complex, said homologue having a root mean square deviation from the
backbone atoms of said amino acids of not more than 1.5 Å.
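
By way of illustration only, the root mean square deviation criterion recited above
can be sketched in a few lines of Python. The sketch assumes that the two sets of
backbone atoms have already been superimposed and are listed in matching order; the
function names and the example coordinates are hypothetical and are not taken from
Table 3.

```python
import math


def backbone_rmsd(coords_a, coords_b):
    """Root mean square deviation between two equal-length lists of (x, y, z)
    backbone atom positions, in the units of the input (e.g. angstroms).
    Assumes the two structures have already been superimposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def within_cutoff(coords_a, coords_b, cutoff=1.5):
    """True if the backbone RMSD does not exceed the cutoff (1.5 by default)."""
    return backbone_rmsd(coords_a, coords_b) <= cutoff


# Hypothetical coordinates for illustration only (not from Table 3).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
candidate = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
rmsd = backbone_rmsd(reference, candidate)
```

In practice an optimal superposition would normally be computed before the
deviation is evaluated; that step is omitted here for brevity.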

An additional aspect of the invention resides in a computer for determining at least a 
portion of the structure coordinates corresponding to an X-ray diffraction pattern of a 
molecule or molecular complex, wherein said computer comprises: 

(a) a machine-readable data storage medium comprising a data storage material 
encoded with machine-readable data, wherein said data comprises at least a portion 
of the structural coordinates according to Table 3; 

(b) a machine-readable data storage medium comprising a data storage material
encoded with machine-readable data, wherein said data comprises an X-ray
diffraction pattern of said molecule or molecular complex;

(c) a working memory for storing instructions for processing said machine-readable
data of (a) and (b);

(d) a central-processing unit coupled to said working memory and to said
machine-readable data storage medium of (a) and (b) for performing a Fourier
transform of the machine-readable data of (a) and for processing said
machine-readable data of (b) into structure coordinates; and

(e) a display coupled to said central-processing unit for displaying said structure 
coordinates of said molecule or molecular complex. 

A yet further aspect of the invention relates to a crystallised molecule or molecular
complex comprising a dimerisation surface defined by structure coordinates of E2NT
amino acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145, Pro106, Lys111, Phe168,
Trp134, Trp33 and Leu94 according to Table 3, or a homologue of said molecule or
molecular complex, wherein said homologue comprises a binding pocket that has a
root mean square deviation from the backbone atoms of said amino acids of not more
than 1.5 Å. The molecule or molecular complex may be defined by the set of
structure coordinates according to Table 3, or a homologue thereof, wherein said
homologue has a root mean square deviation from the backbone atoms of said amino
acids of not more than 1.5 Å.

27. A machine-readable data storage medium (e.g. a magnetic or optical storage
medium, for example a hard disc, a floppy disc or a CD-ROM), comprising a data
storage material encoded with machine-readable data which, when using a machine
programmed with instructions for using said data, is capable of displaying a graphical
three-dimensional representation of a molecule or molecular complex comprising a
dimerisation surface defined by structure coordinates of E2NT amino acids Ile82,
Glu90, Trp92, Lys112, Tyr138, Val145, Pro106, Lys111, Phe168, Trp134, Trp33 and
Leu94 according to Table 3, or a homologue of said molecule or molecular complex,
wherein said homologue comprises a binding pocket that has a root mean square
deviation from the backbone atoms of said amino acids of not more than 1.5 Å.

In the machine-readable data storage medium the molecule or molecular complex
may be defined by the set of structure coordinates according to Table 3, or a
homologue of said molecule or molecular complex, said homologue having a root
mean square deviation from the backbone atoms of said amino acids of not more than
1.5 Å.

The invention further provides a machine-readable data storage medium comprising a
data storage material encoded with a first set of machine-readable data which, when
combined with a second set of machine-readable data, using a machine programmed
with instructions for using said first set of data and said second set of data, can
determine at least a portion of the structure coordinates corresponding to the second
set of machine-readable data, wherein: said first set of data comprises a Fourier
transform of at least a portion of the structural coordinates according to Table 3; and
said second set of data comprises an X-ray diffraction pattern of a molecule or
molecular complex.
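
As a minimal sketch of what a Fourier transform of structural coordinates can look
like, the following Python fragment evaluates the textbook structure factor sum
F(hkl) = sum over atoms of f * exp(2*pi*i*(h*x + k*y + l*z)). It is an illustration
under stated assumptions rather than the procedure used for Table 3: the atoms are
given in fractional coordinates, each atom has a constant scattering factor, and the
two-atom example cell is hypothetical.

```python
import cmath
import math


def structure_factor(atoms, h, k, l):
    """Structure factor F(hkl) for a list of atoms.

    atoms: list of (f, x, y, z), where x, y, z are fractional coordinates and
    f is a constant scattering factor (a simplification: real scattering
    factors depend on resolution, and thermal motion is ignored here)."""
    total = 0j
    for f, x, y, z in atoms:
        total += f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
    return total


# Hypothetical two-atom cell for illustration only.
atoms = [(6.0, 0.0, 0.0, 0.0), (6.0, 0.5, 0.0, 0.0)]
amp_100 = abs(structure_factor(atoms, 1, 0, 0))  # contributions cancel
amp_200 = abs(structure_factor(atoms, 2, 0, 0))  # contributions add
```

Calculated amplitudes of this kind are what would be compared with an observed
X-ray diffraction pattern when known coordinates are used to help solve a related
structure.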

In another aspect, the invention resides in a method for evaluating the ability of a 
chemical entity to associate with a molecule or molecular complex according to the 
invention, comprising the steps of: 

a. employing computational means to perform a fitting operation between the
chemical entity and a dimerisation surface of the molecule or molecular complex;
and

b. analysing the results of said fitting operation to quantify the association
between the chemical entity and the dimerisation surface.
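
Steps a and b can be illustrated with a deliberately simple Python sketch. It is not
the fitting procedure of the invention: the scoring scheme (a reward for each close
contact, a penalty for each steric clash), the distance thresholds, the rigid
translation search and all coordinates are assumptions chosen only to show the shape
of such a calculation.

```python
import math


def distance(p, q):
    """Euclidean distance between two (x, y, z) points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def contact_score(entity_atoms, surface_atoms, contact=4.0, clash=2.5):
    """Score +1 for each entity/surface atom pair closer than `contact`
    and -3 for each pair closer than `clash` (hypothetical thresholds)."""
    score = 0
    for atom in entity_atoms:
        for surf in surface_atoms:
            d = distance(atom, surf)
            if d < clash:
                score -= 3
            elif d < contact:
                score += 1
    return score


def best_translation(entity_atoms, surface_atoms, offsets):
    """Step a: try each rigid translation of the entity.
    Step b: return (best score, offset giving that score)."""
    best = None
    for off in offsets:
        moved = [tuple(c + o for c, o in zip(atom, off)) for atom in entity_atoms]
        score = contact_score(moved, surface_atoms)
        if best is None or score > best[0]:
            best = (score, off)
    return best


# Hypothetical coordinates for illustration only.
surface = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
entity = [(1.5, 0.0, 0.0)]
offsets = [(0.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 10.0, 0.0)]
result = best_translation(entity, surface, offsets)
```

A realistic fitting operation would also sample rotations and conformations and
would use a physically motivated energy function.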

Detailed Description of the Invention

The invention will now be described by way of example only with reference to the 
following Figures and Tables wherein: 

Table 1 illustrates X-ray data and phasing statistics;

Table 2 illustrates refinement and model correlation;

Table 3 shows the structure coordinates of the E2NT module;

Figure 1a represents functional assignments of HPV16 E2 protein;

Figure 1b illustrates sequence alignment of E2NT modules from a subset of HPV
types;

Figure 2a illustrates a stereo view of electron density with a final model at the dimer
interface of the E2NT module, viewed down the crystallographic two-fold axis;

Figure 2b represents a stereo ribbon diagram of the E2NT module;

Figure 2c represents the E2NT dimer;

Figure 3a illustrates a schematic view of URR;

Figure 3b illustrates a schematic view of loop formation induced by binding of E2
proteins to two cognate sites;

Figure 3c illustrates a model of E2 dimer formation;

Figure 3d illustrates loops within URR as shown in Figure 3b;

Figure 4a illustrates the distribution of conserved residues on the E2NT monomer;

Figure 4b illustrates a first cluster of conserved residues on the E2NT monomer;

Figure 4c illustrates a second cluster of conserved residues on the E2NT monomer;
and

Figure 4d illustrates conserved residues Gln12 and Glu39.

Those of skill in the art understand that a set of structure coordinates for an enzyme
or an enzyme-complex or a portion thereof, is a relative set of points that define a
shape in three dimensions. Thus, it is possible that an entirely different set of
coordinates could define a similar or identical shape. Moreover, slight variations
caused by acceptable errors in the individual coordinates will have little, if any, effect
on overall shape. In terms of binding pockets, these acceptable variations would not
be expected to alter the nature of ligands that could associate with those pockets.

The term "associating with" refers to a condition of proximity between a chemical
entity or compound, or portions thereof, and a calcineurin molecule or portions
thereof. The association may be non-covalent, wherein the juxtaposition is
energetically favored by hydrogen bonding or van der Waals or electrostatic
interactions, or it may be covalent.

The invention is also described with reference to US Patent No 5,978,740 which is
included herein by reference, including specifically but not by way of limitation the
computer system diagram described with reference to and illustrated in Fig 3 thereof
as well as the data storage media diagram described with reference to and illustrated
in Figs 4 and 5 thereof.

With reference to Figure 1a and the functional assignments of E2, there is shown a
schematic view of the NT, linker and CT modules of E2 indicating known functions of
each module. Amino acid numbers which delimit the modules correspond to E2
from HPV16. In Figure 1b, there is shown the sequence alignment of the E2NT
modules from a subset of HPV types (HPV16, HPV18, HPV11 and HPV2a) and one
BPV type. Shaded blocks above the alignment indicate the experimentally
determined secondary structure. Shaded blocks below the sequences indicate the
minimal peptide sequences involved in protein:protein interactions, suggested by
mutation studies. Residues with more than 90% identity among 86 PV types are
coloured: red for internal structural residues, green for residues within the fulcrum
region, blue for surface residues.

With reference to the structural features of E2, in Figure 2a there is shown a stereo
view of the electron density with the final model, at the dimer interface of the E2NT
module, viewed down the crystallographic two-fold axis. The likelihood weighted
map is contoured at the 1.5 σ level. Ribbons of two independent monomers are
coloured blue and yellow. Side chains of Arg37 and Ile73, which are known to be
critical for transactivation, are shown in dark green; side chains of other residues
at the dimer interface are shown in light green. Oxygen atoms are in red, nitrogen in
blue, water molecules are shown as orange spheres and hydrogen bonds as dashed
sticks. In Figure 2b, there is shown a stereo ribbon diagram of the E2NT module.
The N1 domain is shown in aquamarine and the N2 domain in pink, with the fulcrum
in green. In Figure 2c, there is shown the dimer of E2NT, showing the extent of the
interface between the two subunits. The view is as in Figure 2a but rotated clockwise
by 90°. Side chains of Gln12 and Glu39, which are critical for interactions with E1,
are shown in magenta. Side chains of residues at the dimer interface are
coloured as per Figure 2a.

With reference to Figures 3a-d there is shown loop formation in the URR of HPV16.
In Figure 3a, there is shown a schematic view of the URR. The four E2-binding sites
are represented by boxes. Numbers in italics indicate distances between individual
sites upstream of the p97 promoter. Two possible E2 configurations, with separate or
dimeric E2NT modules, are shown. In Figure 3b, there is shown a schematic view of
loop formation induced by binding of E2 proteins to two cognate sites, based on the
experiments reported by Knight et al. In Figure 3d, there is shown the possible DNA
loops within the URR as depicted in Figure 3b. In Figure 3c, there is shown a model
of the formation of E2 dimers, showing interactions between both the C-terminal and
E2NT modules. The C-terminal dimer, with its bound DNA, is based on the crystal
structure of this module. The E2NT dimer is proposed from the present work. The
relative orientation and position of the E2NT and C-terminal modules are purely
schematic.

With reference to Figures 4a-d there are shown functionally important residues. In
Figure 4a, there is shown the distribution of conserved residues on the E2NT
monomer. In Figures 4b and 4c there are shown the two clusters of conserved residues
in the fulcrum of E2NT. In Figure 4d, there are shown conserved residues Gln12 and
Glu39. Bonds in ball-and-stick models are coloured aquamarine (N1 domain), pink
(N2 domain) and green (fulcrum). Hydrogen bonds are shown as dashed lines, water
molecules as orange spheres, oxygen atoms are in red, nitrogen atoms in blue and
sulphur atoms in yellow.

There is convincing evidence that the E2 protein has an extended structure, is flexible
and that its functions depend on this property. This is probably the reason why the
intact protein has not yet been crystallised in spite of intensive efforts. A major
problem is the extended flexible linker module, with around 100 residues. E2NT
proved difficult to crystallise, and a number of different constructs were made and
overexpressed before crystallisation with residues 1 to 201 was achieved, but even
this construct possessed limited stability. The protein had to be crystallised within
2-3 days of purification; crystals grew within about 48 hours but only retained useful
diffraction quality for a further 2-3 days. This necessitated that crystals be rapidly
vitrified in cryoprotectant buffer and stored for use as soon as detector time became
available.

Crystals of E2NT belong to the space group P3₁21 with unit-cell dimensions
a = b = 54.3 Å, c = 155.5 Å. The structure was determined using two heavy atom
derivatives and refined with data extending to 1.9 Å spacing (Fig. 2a). The main
chain is well defined throughout with the exception of residues 125 and 126 which
are in an exposed loop and are mobile. There was density for the last residue of the
His-tag at the N-terminus, but none for the remainder of this entity. All amino acids
lie in the allowed regions of the Ramachandran (φ,ψ) plot with 92.4% in most
favoured regions.
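
As a worked illustration of the unit-cell dimensions quoted above, the cell volume
can be computed with the general cell volume formula. The sketch assumes the
conventional hexagonal axes for a trigonal space group (alpha = beta = 90 degrees,
gamma = 120 degrees), which the text does not state explicitly, and uses the six
general positions of space group P3₁21 to obtain the volume per asymmetric unit.
The function name is hypothetical.

```python
import math


def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume of a general (triclinic) unit cell; lengths in angstroms,
    angles in degrees, result in cubic angstroms."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell dimensions from the text; gamma = 120 degrees is an assumption
# based on the usual hexagonal setting for trigonal space groups.
volume = unit_cell_volume(54.3, 54.3, 155.5, gamma=120.0)

# P3(1)21 has six general positions, so the cell holds six asymmetric units.
volume_per_asymmetric_unit = volume / 6
```

With these inputs the cell volume comes out at roughly 4 x 10^5 cubic angstroms.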

The transactivation module is composed of two domains, N1 and N2, arranged so as
to give it an overall L-shaped appearance. Analysis of the PDB using DALI shows
that both have unique organisation of their secondary structures. Domain N1, which
forms the N-terminus of the intact E2, is composed of residues 1 to 92, which fold
into three long α-helices (Figure 2b,c). There is a tight loop between α1 and α2 and
a more extended one between α2 and α3. The three helices pack antiparallel to one
another in the form of a twisted plane, with angles of about 20° and 25° between the
pairs of consecutive helices. DALI indicated a maximum Z-score of 5.7, which could
suggest a significant correlation, for colicin Ia, a membrane protein which contains
three 80 Å long α-helices arranged more or less coplanar. This is the only other
known protein that contains a true domain made up of such a packing of three
helices. In addition there were 42 other structures which gave Z-scores above 4.0,
most of which were four helix bundles, such as bacterioferritin. However, in these
only two of the three N1 helices superimposed simultaneously on two, not always
adjacent, bundle helices as a result of a more planar arrangement of helices within
N1. The indications are that the similarities observed reflect the optimum stacking
angle of antiparallel helices against one another rather than suggesting a common
ancestor for the evolution of these molecules.

Domain N2 is made up of residues 110 to 201 and is composed almost entirely of
antiparallel β structure, with only one short helical segment from residues 171 to 178
(Figure 2b,c). The secondary structure has two short three- and four-stranded
antiparallel β pleated sheets interconnected by two-stranded β ribbons. For this
domain DALI failed to identify any significant homologies to known structures, with
a highest Z-score of only 2.1. From the analysis of Harris and Botchan and the
present study, the N2 fold appears to be novel.

The structure between the N1 and N2 domains (residues 93 to 109) contains two
consecutive single turns of helical structure, resulting in a compact and tight turn. It
packs closely against elements of both domains and is not a truly independent
structural domain. Rather it forms a fulcrum in the L-shape formed by N1 and N2
where it could act as a hinge, allowing the two domains to change their relative
conformation in a specific way. Several of the interactions between adjacent regions
of chain in the fulcrum are mediated indirectly through H-bonds involving water
molecules, suggesting the possibility of flexibility.

One of the most striking features of the crystal structure is the association of two
E2NT monomers into a tight dimer. The two E2NT monomers pack around the
crystallographic 2-fold axis, as shown in Figure 2a. The dimer interface is formed
mostly by amino acids from helices α2 and α3 of the N1 domain and by residues
142-144 from the N2 domain. The total buried surface area between the two E2NT is
2026 Å², comparable to the 2444 Å² buried between the two E2CT, which are
known to form a tight dimer with a Kd of 3-6 x 10"* M 

In the E2NT dimer interface, each subunit contributes a cluster of seven equivalent
residues, invariant or conserved in the 86 known sequences of E2, with many direct
and water-mediated hydrogen bonds and rather few non-polar contacts (Fig. 2).
Analysis of the dimer forming surfaces shows that all the direct hydrogen bonds
between monomers are made through these seven amino acids. For the invariant
Arg37, all possible side-chain hydrogen bonds are made and all are well defined
(Figure 2). Three of them are across the dimer interface. One hydrogen bond is
critical, from NH2 to the main chain carbonyl oxygen of Leu77. A second hydrogen
bond from NH2 is to OG1 of Thr81; in five out of 86 sequences this residue is
glutamine, and modelling shows a hydrogen bond is possible to the NE of Arg37.
The NH1 of Arg71 H-bonds to the OE1 of residue 80, which is Glu or Gln in all but
six variants. At the NE of Arg37 there is an ideal H-bond to water that itself makes
another strong H-bond across the dimer interface to the main-chain carbonyl oxygen
of residue 142. The role of the invariant Ile73 is the filling of the intersubunit
non-polar volume made up of the aliphatic parts of Arg37, Gln76 and of Leu77, in this
case from both monomers. Leu77 is in a few sequences substituted by valine or
isoleucine and in 9 out of 86 known sequences by methionine. Inspection of the
structure shows that Leu77 is partially exposed to the solvent and therefore different
hydrophobic side chains could be easily accommodated at this site. Another
important non-polar side chain is Ala69. Its side chain methyl packs into the surface
of the other monomer at van der Waals distance from the main chain of residue 142.
The only observed mutation of Ala69 is to Gly, and is easily accommodated. Gln76
is conserved or has homologous substitutions in about 2/3 of E2 sequences; in about
1/4 of the sequences there is methionine or valine at this position. Although
hydrophobic substitutions of Gln76 would disrupt the hydrogen bonding to Glu80
across the dimer interface, and to Arg37 from the same subunit, the hydrophobic side
chain at residue 76 could instead make a compensating hydrophobic interaction with
the adjacent intersubunit hydrophobic pocket formed by Ile73 and Leu77.

Modelling of the amino acid variations in the 86 known papillomavirus E2 proteins
into the other contacts at the dimer interface shows that they generally can be
accommodated (data not shown). The consistency of the hydrogen bonds and van der
Waals contacts at the monomer-monomer interface in the various sequences suggests
therefore that the E2NT dimer interactions are potentially present in all
papillomaviruses.

The first experimental evidence for the E2NT dimerisation in the presence of DNA
with multiple E2-binding sites was provided by Knight et al. in 1991. Their studies
showed that intact E2 led to the formation of DNA loops on templates with widely
separated E2 binding sites, while a truncated E2, containing the DNA-binding E2CT
but missing the N-terminal 161 residues, did not. Such dimerisation is further
supported by the observed synergistic transcription activation by a complex of two
DNA-bound E2 dimers.

To analyse the functional behaviour of the E2NT dimers further, we measured the
dissociation constant by sedimentation equilibrium using analytical
ultracentrifugation of recombinant E2NT protein containing the 201 N-terminal
amino acids. A value of Kd = 8.1 ± 4 × 10⁻⁶ M was obtained, indicating
medium-strength association. The micromolar-range dissociation constant of the E2NT
dimer is certainly physiologically significant, and compares well with values for other
transcription factors which have relatively low dissociation constants, often with
Kd values between 1 and 20 µM. In vivo, the interaction could be enhanced when the
two E2NT modules are placed in close proximity. Indeed, E2CT forms dimers which
bind to the multiple DNA-binding sites located within the URR of viral DNA with
Kd of protein:DNA interactions usually in the nanomolar range. Consequently, the
local concentration of E2NT, bound to the E2CT via the non-conserved, flexible ~80
amino-acid linker, is effectively increased.
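
The effect of local concentration on dimerisation can be illustrated with a short
calculation. The sketch assumes a simple two-state monomer-dimer equilibrium,
2M <=> D with Kd = [M]^2/[D], and uses a micromolar dissociation constant of the
order discussed above; the function name and the chosen concentrations are
hypothetical.

```python
import math


def dimer_fraction(total_conc, kd):
    """Fraction of protein chains present as dimer for 2M <=> D.

    total_conc: total concentration of chains, [M] + 2[D] (molar).
    kd: dissociation constant [M]^2 / [D] (molar).
    Free monomer solves 2[M]^2/kd + [M] - total_conc = 0."""
    if total_conc <= 0 or kd <= 0:
        raise ValueError("concentrations must be positive")
    free_monomer = kd * (math.sqrt(1 + 8 * total_conc / kd) - 1) / 4
    return 1 - free_monomer / total_conc


KD = 8.1e-6  # assumed micromolar dissociation constant, in mol/l

# Hypothetical total concentrations: dilute, equal to Kd, and locally enriched.
fractions = {c: dimer_fraction(c, KD) for c in (1e-7, KD, 1e-3)}
```

In this model exactly half of the chains are dimeric when the total chain
concentration equals Kd, and the dimeric fraction rises as the effective local
concentration increases, which is the trend described in the preceding paragraph.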

E2NT dimer interactions, as seen in the crystal structure, could form either between
modules which are already part of a single E2 dimer, formed as a result of E2CT
dimerisation interactions and bound to a single E2 binding site on the DNA (Fig. 3a),
or between two preformed E2 dimers located on different E2 binding sites (Fig. 3b).
The results of the electron microscopy suggest that the latter dimerisation does
occur. Although no direct experimental evidence exists for the former dimerisation,
it does also seem possible due to the flexibility of the linker connecting the two
modules. We propose that E2 molecules may initially keep their N-terminal modules
within their internal dimers, but swap N-terminal modules and cross link to E2
molecules bound to distant DNA binding sites to form active loop structures during
transcriptional activation and/or HPV DNA replication (Figure 3d). As discussed
below, the effects of mutations on transcriptional transactivation can be explained in
terms of the dimer being an essential element in this process.

E2 is a regulator of both transcription and viral DNA replication and thus interacts
with other viral and host macromolecules in the infected cell. Indication of the
possible importance of individual residues in the function comes firstly from the
structure, secondly from the extensive set of sequences of the papillomaviral E2s
and thirdly from mutagenesis studies on the individual proteins. In the following we
make a primary attempt to map the molecule's function onto its structure.

The pattern of amino acid conservation for the 86 available papilloma sequences
has been analysed using the GCG program suite. The sequences exhibit striking
variation, characteristic of some virus families. However, 33 of the total 201
residues in the E2NT construct were totally or highly conserved. Fig. 4a illustrates
the distribution of these 33 residues in the dimer. These were categorised into two
sets: those with an essentially structural role and those exposed on the surface with a
potential for intermolecular interactions. Thirteen residues (Fig. 1b) are buried or
play a purely structural role within the monomer; they are not expected to be of
functional importance and will not be discussed here.
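
The kind of conservation analysis described above can be sketched as follows. This
is not the GCG procedure used in the present work: it is a minimal Python
illustration that flags alignment columns whose most common residue exceeds an
identity threshold (90% by default, matching the colouring criterion of Figure 1b),
and the four short aligned sequences are invented for the example.

```python
from collections import Counter


def conserved_positions(aligned_seqs, threshold=0.9):
    """Return (column, residue) pairs, with 1-based columns, for which the most
    common residue occurs in more than `threshold` of all sequences.
    Gap characters ('-') never count as a conserved residue."""
    if not aligned_seqs:
        raise ValueError("need at least one sequence")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("sequences must be aligned to the same length")
    n = len(aligned_seqs)
    result = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned_seqs if seq[i] != "-")
        if counts:
            residue, count = counts.most_common(1)[0]
            if count / n > threshold:
                result.append((i + 1, residue))
    return result


# Invented toy alignment for illustration only (not E2 sequences).
toy_alignment = ["MKRL", "MKQL", "MRRL", "MKRL"]
conserved = conserved_positions(toy_alignment)
```

With only four toy sequences, a column passes the 90% threshold only if it is
identical in all of them.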

A further 12 of these 33 residues stand out as having a structural role in the interface
of the N1 and N2 domains. They form three clusters, the first making direct
interactions between the two domains (Ile82, Glu90, Trp92, Lys112, Tyr138,
Val145) and two separate sets of interactions, one from N2 (Pro106, Lys111, Phe168,
Trp134) and the other from N1 (Trp33, Leu94) to the structure connecting them,
referred to here as a fulcrum. The first two clusters are shown in Figure 4b,c and it
can be seen that Lys111 and Lys112 play key roles. Their side chains point in
opposite directions to one another and their terminal amino groups are involved in
near ideal patterns of hydrogen bonds. The flat surfaces of their extended side chains
stack against Trp134 and Trp92, respectively. This clustering of invariant residues at
the interface indicates a functional importance for the relative orientation of N1 and
N2. The fulcrum could indeed provide a flexible pivot between the two domains, but
there is no direct evidence for this as yet. Finally, while the side chain of Glu90 is
held tightly in place by two H-bonds and could have a structural role, its OE2 atom is
exposed on the surface and is surrounded by near invariant side-chains, which may
thus play a part in interactions with other molecules.

Of the remaining eight conserved residues, mutational substitutions of Glu20,
Glu100 and Asp122 had moderate effects on the transactivation and replication
properties of E2, which depended on a particular viral strain. Glu20 lies on the top
surface of N1. Asp122 lies far away on the distal surface of N2. Glu100 is
completely exposed and points into the solvent at the junction of the L between the
N1 and N2 domains. The functional role of these amino acids has yet to be clarified.

Three conserved amino acids (Arg37, Glu39 and Ile73) have been subjected to point
mutation and the effects on the two principal functions of E2, i.e. transactivation and
HPV DNA replication, have been assessed (reviewed elsewhere). Together with
the remaining two conserved amino acids, Gln12 and Ala69, these residues form two
functionally important surfaces (see below).

Finally, a number of the mutational results (reviewed elsewhere) correspond to
residues that can be assigned to structural roles. Substitution of these residues will
lead to substantial conformational changes and a probable inability to fold correctly.
This is particularly true for some of the deletion mutants involving the core of the
molecule. Knowledge of the structure will allow a more rational choice and design
of mutants in the future.

The induction of DNA loops by E2NT dimerisation could be important for the
construction of the active transcription bubble by targeting DNA-binding
transcription factors, bound at distal sites, to the region proximal to the start of
transcription (reviewed elsewhere). In support of this, residues Arg37, Ile73 and Gln76
map onto the surface of E2NT involved in dimer formation, and mutations result in
considerable disruption of transactivation, while having little effect on replication.
The structure also shows that Ala69, which points its side chain methyl across
the dimer interface, is also critical for transactivation. Mutational substitutions to
amino acids with longer side chains should have a knock-out effect on E2NT dimer
formation and consequently on transactivation.

The sites of association with cellular transcription factors AMF-1 (residues 74-134)
and TFIIB (134-216) were previously mapped onto the E2NT module (Figure 1)
using a series of deletion mutants as well as point mutations. These sites were
mutually exclusive. In the structure, residues 74-134 include the fulcrum, while
residues 134-216 correspond to domain N2. Further biochemical and structural
studies can now be planned to characterise these interactions in more detail.

Replication of the viral genome is initiated by binding of another viral protein, E1, to
the origin of DNA replication, which is itself flanked by two E2 binding sites (Fig.
3a). While the function of E2CT dimers is to bind specifically to the DNA sites,
E2NT interaction with E1 enhances the binding of E1 to this region. Mutational
substitutions of Glu39 generally retained transcriptional activation while DNA
replication was substantially reduced. In the structure, the conserved Glu39
makes every possible hydrogen bond by its side chain carboxyl oxygens (Fig. 4d).
One hydrogen bond is to NE2 of Gln12, which is absolutely conserved in all known
sequences of E2. The other three hydrogen bonds are to the water molecules which
are part of an intimate net of well-defined water molecules surrounding Glu39 and
mediating its interactions with adjacent residues. Interestingly, a number of these
protein interactions with water molecules are conserved as they are made to the
protein backbone, including carbonyl oxygens of Gln12, Met36 and Lys68. While
mutation of Gln12 in BPV1 only slightly affected both transactivation and
replication, it substantially reduced cooperative origin binding. The close
positioning of Gln12 and Glu39 in the three-dimensional structure further enhances
the notion that these two residues are involved in interactions with E1. The conserved
set of interactions at Gln12/Glu39 suggests that the main chain carbonyl oxygens of
Gln12 and Met36 and the conserved water molecules could also be involved in these
interactions. Gln12/Glu39 are surrounded by Leu8, Ile15, Met36, Tyr43, Gln57 and
Lys68, which are unlikely to contribute to E2/E1 interactions, as these residues are
not well conserved in E2 sequences from different papillomaviruses.
The Gln12/Glu39 cluster lies on a side of the N1 domain which is opposite to the
side involved in transactivation (and dimerisation) (Figure 2c). Notably, the spatial
separation of the two functionally important surfaces suggests that the E2NT module
could be able to interact with E1 at the same time as it interacts through the
dimerisation interface with another E2NT module.

The structure reported here for the entire E2 transactivation module has several
implications for understanding of E2 function. It is now possible to map known
mutations onto the E2 three-dimensional structure, and to use the knowledge of
amino acid conservation and the effects of mutations to assign roles in folding,
structure and function to residues. To this end, our results indicate that molecular
surfaces involved in transactivation and E1-binding are located at opposite sides of
the N1 domain of E2NT, suggesting that both surfaces could be accessed
simultaneously by other protein factors. In line with these observations, E1 has been
shown to modulate transactivation by directly interacting with E2, leading to
repression of transactivation in the presence of excess E1. It is not inconceivable
that the docking of E2NT dimer with E1 is sufficient to block further association
with other target proteins.

The structure shows that the transactivation surface is involved in the formation of
the E2NT dimer, which could cross-link E2 molecules bound by their E2CT modules
to well-separated DNA sites. Inevitably, such dimerisation would cause DNA to
form a loop structure, targeting distally bound transcription factors to regions close to
the promoter. While this process has been suggested to be essential for
transactivation, the definition of interacting surfaces between E2 and other cellular
transcription factors requires a great deal of further study.

Our results suggest that the process of DNA loop formation could involve swapping
of E2NT modules across E2 dimers bound at separated DNA sites (Fig. 3a-d). The
polar components of the monomer-monomer interactions may favour such exchange.
Domain swapping is a well-recognised phenomenon that occurs relatively frequently
between two individual monomers containing domains connected by a flexible linker.
E2 is, to our knowledge, the first example where the swapping event is predicted
to occur between dimers.

The dimerisation surface of E2 represents a good target for designing anti-viral drugs,
since it is essential for viral transcription, there is no homologous human protein and
the residues forming the interface are highly conserved among different viral strains.
Dynamic interactions between transcription factors play a central role in the
regulation of transcription and replication. Dimerisation, heterodimerisation and the
monomer-to-dimer transition may play important roles during the control of the
papillomavirus life cycle. These processes themselves can be regulated through
phosphorylation, proteolysis, interaction with small ligands or changes in their
intracellular concentration. It has been suggested that E2 can regulate the switch
between early gene expression and viral genome replication during HPV infection.
It is possible that dimerisation of E2NT modules plays an essential role during this
process. One scenario would be to activate transcription via induction of DNA loop
formation at early stages of the viral life cycle. At later stages, when the
concentration of expressed E2 proteins within the cell becomes high and comparable
with the Kd for E2 dimer formation, free E2NT modules could compete for
dimerisation with those involved in DNA loop formation and titrate them away,
switching off transcription and stimulating replication. It is also possible that other
protein factors could be involved in this process, including, for example, E1.
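The concentration dependence invoked in this scenario can be illustrated with a simple mass-action calculation. The sketch below assumes an ideal monomer-dimer equilibrium with Kd = [M]^2/[D] and total protein expressed in monomer units; the function name and the numerical values are hypothetical and are not taken from the measurements described herein.

```python
import math


def monomer_dimer_fractions(total_conc, kd):
    """Return (fraction_monomer, fraction_dimer) of protein chains at equilibrium.

    total_conc: total protein concentration in monomer units.
    kd: dissociation constant, Kd = [M]^2 / [D], in the same units.
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("total_conc and kd must be positive")
    # Mass balance: total = [M] + 2[D] with [D] = [M]^2 / Kd gives
    # 2[M]^2 + Kd[M] - Kd*total = 0; the positive root is the free monomer.
    monomer = (-kd + math.sqrt(kd * kd + 8.0 * kd * total_conc)) / 4.0
    fraction_monomer = monomer / total_conc
    return fraction_monomer, 1.0 - fraction_monomer


if __name__ == "__main__":
    kd = 1.0  # hypothetical value, arbitrary concentration units
    for ratio in (0.01, 0.1, 1.0, 10.0, 100.0):
        fm, fd = monomer_dimer_fractions(ratio * kd, kd)
        print(f"total/Kd = {ratio:>6}: monomer {fm:.3f}, dimer {fd:.3f}")
```

Under these assumptions, half of the chains are in dimers when the total concentration equals Kd, which is the regime in which free modules would begin to compete effectively for dimerisation.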

The invention therefore includes the use of the E2NT crystal structure in the design of
anti-viral drugs, since it is essential for viral transcription. In the rationalised
computational design of drugs using the crystal structure, computational analyses are
therefore necessary to determine whether a molecule or the E2NT-binding portion
thereof is sufficiently similar to the E2NT structure. Such analyses may be carried
out in current software applications, such as the Molecular Similarity application of
QUANTA (Molecular Simulations Inc., Waltham, Mass.) version 3.3, and as
described in the accompanying User's Guide, Volume 3, pages 134-135.


The Molecular Similarity application permits comparisons between different
structures, different conformations of the same structure, and different parts of the
same structure. The procedure used in Molecular Similarity to compare structures is
divided into four steps: 1) load the structures to be compared; 2) define the atom
equivalences in these structures; 3) perform a fitting operation; and 4) analyze the
results.

Each structure is identified by a name. One structure is identified as the target (i.e.,
the fixed structure); all remaining structures are working structures (i.e., moving
structures). Atom equivalency within QUANTA is defined by user input and, for the
purpose of this invention, equivalent atoms may be defined as protein backbone atoms
(N, Cα, C and O) for all conserved residues between the two structures being
compared. We will also consider only rigid fitting operations.

When a rigid fitting method is used, the working structure is translated and rotated to
obtain an optimum fit with the target structure. The fitting operation uses a least
squares fitting algorithm that computes the optimum translation and rotation to be
applied to the moving structure, such that the root mean square difference of the fit
over the specified pairs of equivalent atoms is an absolute minimum. This number,
given in angstroms, is reported by QUANTA.

For the purpose of one class of embodiments of this invention, any set of structure
coordinates of a molecule or molecular complex that has a root mean square
deviation of conserved residue backbone atoms (N, Cα, C, O) of less than 1.5 Å
when superimposed, using backbone atoms, on the relevant structure
coordinates of E2NT is considered identical. More preferably, the root mean square
deviation is less than 1.0 Å. Most preferably, the root mean square deviation is
less than 0.5 Å.
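The rigid fitting operation and the deviation thresholds described above can be sketched as follows. This is a minimal illustration, not the QUANTA implementation: it assumes that the equivalent backbone atoms have already been paired, and it uses Horn's quaternion formulation of the least-squares superposition (a technique swapped in here for illustration), with the largest eigenvalue obtained by Jacobi rotations. All function names are hypothetical.

```python
import math


def _centre(points):
    """Translate (x, y, z) points so that their centroid is at the origin."""
    n = len(points)
    c = [sum(p[i] for p in points) / n for i in range(3)]
    return [(p[0] - c[0], p[1] - c[1], p[2] - c[2]) for p in points]


def _largest_eigenvalue(matrix, sweeps=50):
    """Largest eigenvalue of a small symmetric matrix by cyclic Jacobi rotations."""
    a = [list(row) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < 1e-30:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return max(a[i][i] for i in range(n))


def fitted_rmsd(target, working):
    """Minimum RMSD after optimal translation and proper rotation of `working`."""
    if len(target) != len(working) or not target:
        raise ValueError("need equal, non-empty lists of equivalent atoms")
    q = _centre(target)
    p = _centre(working)
    # Correlation matrix S[a][b] = sum over atoms of p[a] * q[b].
    s = [[sum(pi[a] * qi[b] for pi, qi in zip(p, q)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    horn = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = _largest_eigenvalue(horn)
    norm = sum(x * x for pt in p + q for x in pt)
    return math.sqrt(max(0.0, (norm - 2.0 * lam) / len(p)))


def similarity_class(rmsd):
    """Map an RMSD in angstroms onto the thresholds stated in the text."""
    for limit in (0.5, 1.0, 1.5):
        if rmsd < limit:
            return f"< {limit} A"
    return "outside 1.5 A"
```

For example, `similarity_class(fitted_rmsd(target_backbone, working_backbone))` would report which of the stated thresholds a candidate structure meets, where both arguments are lists of (x, y, z) tuples for the paired N, Cα, C and O atoms.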

The term "root mean square deviation" means the square root of the arithmetic mean
of the squares of the deviations from the mean. It is a way to express the deviation or
variation from a trend or object. For purposes of this invention, the "root mean square
deviation" defines the variation in the backbone of a protein from the backbone of
E2NT or a dimerising portion thereof, for example as defined by the structure
coordinates of E2NT described herein.


The term "least squares" refers to a method based on the principle that the best 
estimate of a value is that in which the sum of the squares of the deviations of 
observed values is a minimum. 
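As an illustration of how these two definitions combine in a rigid fitting operation, the quantities can be written as below. The symbols are assumptions introduced here and do not appear in the original text: x_i and y_i denote the coordinates of the i-th pair of equivalent backbone atoms in the working and target structures, N is the number of pairs, R is a proper rotation and t a translation.

```latex
\mathrm{RMSD}(R,\mathbf{t})
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
      \bigl\lVert R\,\mathbf{x}_i + \mathbf{t} - \mathbf{y}_i \bigr\rVert^{2}},
\qquad
(R^{*},\mathbf{t}^{*})
  = \operatorname*{arg\,min}_{R \in SO(3),\; \mathbf{t} \in \mathbb{R}^{3}}
    \sum_{i=1}^{N}
      \bigl\lVert R\,\mathbf{x}_i + \mathbf{t} - \mathbf{y}_i \bigr\rVert^{2}
```

On this reading, the value compared against a threshold such as 1.5 Å would be the RMSD evaluated at the least-squares optimum (R*, t*).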

Materials and Methods

Purification and crystallisation. 

Details of the purification and crystallisation of E2NT have been described
previously. Briefly, the ORF encoding the N-terminal 201 residues of HPV-16 E2
was cloned into the prokaryotic expression plasmid pET15b downstream of the
20-residue His-tag leader sequence; protein was expressed in E. coli BL21(DE3)pLysS
and purified using nickel affinity and anion exchange chromatography. Crystals were
obtained by hanging drop vapour diffusion with 0.8-1.2 M ammonium sulphate, 0.1 M
triethanolamine pH 8.0-8.3 and 3-5% 2-methyl-2,4-pentanediol. Crystals grew only
with very fresh protein preparations and deteriorated in terms of diffraction quality in
less than a week. This necessitated freezing and storage of crystals in liquid nitrogen
immediately after growth, as discussed above.

Structure determination. 

All data were recorded on cryogenically frozen crystals. A native crystal was frozen
for which initial data were recorded to 3.4 Å. For the screening of derivatives,
crystal stability was even more limiting. Nine crystals were soaked in various heavy
atom reagents immediately after growth. The crystals were screened in-house using a
MAR Research imaging plate on a Rigaku RU200 rotating anode source, by recording
3° of data for each and analysing the fractional isomorphous difference from the
native. Three derivatives showed promising differences from the native, in the range
of 15-20% after scaling using SCALEPACK, and were stored in liquid nitrogen.
The native crystal was transported to EMBL Hamburg where 1.9 Å data were
measured using synchrotron radiation from beam line X11 (Table 1). In addition, data
were recorded at EMBL for the three promising derivatives to about 2.7 Å. Two of
these derivatives proved useful in phase determination and the structure was solved
by multiple isomorphous replacement with anomalous scattering (MIRAS) at 2.7 Å.
The two derivatives were solved independently using the CCP4 suite from the
difference Patterson synthesis and by direct methods as implemented in SHELX.
Both contained a single heavy atom site. Phases, calculated using MLPHARE, were
enhanced by solvent flattening using a solvent content of 50%. The resulting high
quality density map was easily interpretable and the initial model was built using
QUANTA (Molecular Simulations) for all but four residues of the construct, ignoring
the His-tag. The model was completed with REFMAC (resolution 20-1.9 Å) using a
bulk solvent correction, to an R-factor of 23.3% (Rfree 29.7% for 5% of the data).

There are 221 residues in the recombinant protein: the first twenty comprise the
His-tag. The final model contains all but two of the 201 residues of the real protein:
residues 125-126 are disordered and lie in a flexible surface loop. Only one residue,
His0, of the His-tag has clear density and an ordered conformation. In addition there
are 187 water molecules, which were selected using ARP during the course of
refinement. The main statistics of the refined model are shown in Table 2.
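The R-factor and free R-factor quoted above follow the usual crystallographic definition, R = Σ||Fo| − |Fc|| / Σ|Fo|, with the free R computed over a reserved fraction of reflections (5% here). The sketch below is illustrative only: it assumes that the observed and calculated amplitudes are already on a common scale, it is not the REFMAC procedure, and the function names and the random selection scheme are hypothetical.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need equal, non-empty amplitude lists")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0.0:
        raise ValueError("observed amplitudes sum to zero")
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly reserve a fraction of reflection indices for the free R."""
    n_free = round(n_reflections * free_fraction)
    free = set(random.Random(seed).sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)
```

The working R would then be `r_factor` evaluated over the `work` indices and the free R over the `free` indices, the latter never being used to adjust the model.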

Analytical ultracentrifugation. 

Experiments were carried out in an Optima XL-A ultracentrifuge (Beckman Coulter,
CA, USA) using scanning UV optics. During the experiments, the recombinant
E2NT was in 10 mM Tris-HCl pH 8.0, 5 mM DTT, 0.2 mM EDTA, 300 mM NaCl.
Data were obtained at rotor speeds of 12,000 and 16,000 rpm, and the time to
equilibrium was 10-12 hours. All runs were carried out at 293 K, and all radial scans
were at a wavelength of 280 nm. Dissociation constants were obtained by nonlinear
regression using the Beckman ultracentrifuge software.

Table 1

X-ray data and phasing statistics

Data set                              Native          Derivative 1    Derivative 2 (UAC)
Space group                           P3₁21           P3₁21           P3₁21
a, b (Å)                              54.68           54.49           54.58
c (Å)                                 155.73          155.66          156.50
Resolution (Å)                        30-1.9          20-2.7          20-2.7
Temperature (K)                       120             120             120
Wavelength (Å)                        0.86            0.86            0.86
Unique reflections                    21751           7873            7937
Completeness (%) (outer shell)        98.8 (89.3)     99.8 (96.1)     99.7 (93.8)
R-merge (outer shell)                 0.058 (0.339)   0.073 (0.271)   0.061 (0.268)
Phasing power (centric / acentric)                    1.55 / 2.07     0.95 / 1.40
FOM: MIRAS                            0.59
FOM: DM 20-2.7 Å (2.7-1.9 Å)          0.88 (0.61)
DM: mean phase change (20-2.7 Å)      32°
R-factor (free R)                     0.223 (0.295)

Table 2

Refinement and model correlation

Resolution                                          1.9-10.0 Å
Number of protein atoms                             1622
Number of solvent sites                             211
Number of reflections used in refinement            20637
Number of reflections used for Rfree calculation    1111
R-factor*                                           0.232
Rfree*                                              0.305
Average atomic B-factor (Å²), protein atoms         38.0
                              water molecules       48.5
R.m.s. deviations from ideal geometry (Å), targets in parentheses
    bond distance                                   0.013 (0.020)
    angle distance                                  0.026 (0.040)
    chiral volume                                   0.142 (0.200)

*Crystallographic R-factor, R (Rfree) = $\sum \big| |F_o| - |F_c| \big| / \sum |F_o|$
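The unit-cell dimensions in Table 1 and the CRYST and SCALE records at the head of Table 3 are related by the standard orthogonal-to-fractional transformation. The following sketch assumes the usual PDB axis convention (a along x, b in the xy plane) and uses a hypothetical function name; it is offered as a consistency check on the tabulated values, not as part of the original method.

```python
import math


def fractionalisation_matrix(a, b, c, alpha, beta, gamma):
    """3x3 matrix converting orthogonal angstrom coordinates to fractional ones.

    Cell lengths in angstroms, angles in degrees.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    return [
        [1.0 / a, -cg / (a * sg), (ca * cg - cb) / (a * v * sg)],
        [0.0, 1.0 / (b * sg), (cb * cg - ca) / (b * v * sg)],
        [0.0, 0.0, sg / (c * v)],
    ]


def to_fractional(matrix, xyz):
    """Apply the fractionalisation matrix to one (x, y, z) coordinate."""
    return tuple(sum(row[k] * xyz[k] for k in range(3)) for row in matrix)


if __name__ == "__main__":
    # Native cell from Table 1; angles as given in the CRYST record of Table 3.
    m = fractionalisation_matrix(54.68, 54.68, 155.73, 90.0, 90.0, 120.0)
    for row in m:
        print("  ".join(f"{value: .5f}" for value in row))
    # First atom listed in Table 3 (N of His0).
    print(to_fractional(m, (5.469, -26.512, 52.262)))
```

With the native cell, the non-zero elements of this matrix agree with the SCALE records to the five decimal places printed there.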
Table 3

CRYST1   54.680   54.680  155.730  90.00  90.00 120.00 P3121
SCALE1      0.01829  0.01056  0.00000        0.00000
SCALE2      0.00000  0.02112  0.00000        0.00000
SCALE3      0.00000  0.00000  0.00642        0.00000
ATOM      1  N   HIS A   0       5.469 -26.512  52.262  1.00 61.92
ATOM      2  CA  HIS A   0       6.434 -25.669  51.568  1.00 61.84
ATOM      3  C   HIS A   0       6.263 -25.743  50.051  1.00 53.91
ATOM      4  O   HIS A   0       6.089 -24.713  49.607  1.00 69.59
ATOM      5  CB  HIS A   0       7.837 -26.127  51.965  1.00 54.18
ATOM      6  CG  HIS A   0       7.848 -26.468  53.431  0.00 99.00
ATOM      7  ND1 HIS A   0       7.914 -25.533  54.412  0.00 99.00
ATOM      8  CD2 HIS A   0       7.732 -27.728  54.027  0.00 99.00
ATOM      9  CE1 HIS A   0       7.828 -26.215  55.570  0.00 99.00
ATOM     10  NE2 HIS A   0       7.723 -27.531  55.370  0.00 99.00
ATOM     11  N   MET A   1       6.663 -26.896  49.478  1.00 56.24
ATOM     12  CA  MET A   1       6.435 -27.076  48.053  1.00 56.42
ATOM     13  C   MET A   1       5.209 -26.282  47.619  1.00 56.07
ATOM     14  O   MET A   1       5.293 -25.299  46.911  1.00 56.51
ATOM     15  CB  MET A   1       6.216 -28.565  47.788  1.00 60.46
ATOM     16  CG  MET A   1       6.856 -29.020  46.477  0.00 99.00
ATOM     17  SD  MET A   1       7.244 -30.775  46.483  0.00 99.00
ATOM     18  CE  MET A   1       7.499 -30.975  44.711  0.00 99.00
ATOM     19  N   GLU A   2       4.035 -26.755  48.064  1.00 54.92
ATOM     20  CA  GLU A   2       2.803 -26.044  47.744  1.00 53.59
ATOM     21  C   GLU A   2       2.870 -24.570  48.154  1.00 52.81
ATOM     22  O   GLU A   2       2.555 -23.664  47.393  1.00 51.69
ATOM     23  CB  GLU A   2       1.661 -26.740  48.482  1.00 56.88
ATOM     24  CG  GLU A   2       2.090 -28.092  49.054  0.00 99.00
ATOM     25  CD  GLU A   2       1.019 -28.610  49.983  0.00 99.00
ATOM     26  OE1 GLU A   2       0.454 -27.819  50.722  0.00 99.00
ATOM     27  OE2 GLU

Claims



1. A crystallised molecular complex of an E2 N-terminal module (E2NT) dimer
protein or homologue thereof, comprising residues vital for transcriptional and
replicational activities of said protein.

2. An E2NT dimer protein according to Claim 1 wherein the residues lie on
opposite sides of an N-terminal domain.

3. An E2NT dimer protein according to either preceding claim wherein the
residues comprise a plurality of residue clusters associated with a structural role at an
interface between N1 and N2 terminal domains of respective monomers within the
dimer.

4. An E2NT dimer according to Claim 3 comprising three clusters.

5. An E2NT dimer according to either of Claims 3 or 4 wherein a first cluster of
vital residues is associated with interactions between N1 and N2 domains and
comprises any one or more of the following residues: Ile82, Glu90, Trp92, Lys112,
Tyr138, Val145.

6. An E2NT dimer according to any one of Claims 3-5 wherein a second cluster
of residues is associated with N1 interactions and comprises either or both of residues
Trp33 and Leu94.

7. An E2NT dimer according to any one of Claims 3-6 wherein a third cluster of
residues is associated with N2 interactions and comprises any one or more of the
following residues: Pro106, Lys111, Phe168, Trp134.

8. An E2NT dimer according to any preceding claim further comprising residues
associated with transactivation and/or replication properties of E2.

9. An E2NT dimer according to Claim 8 wherein the residues comprise any one
or more of the following residues: Glu20, Glu100, Asp122, Arg37, Glu39, Ile73,
Gln12 and Ala69.

10. Use of a crystallised molecular complex of an E2 N-terminal module (E2NT)
dimer protein according to any preceding claim or homologue thereof in mapping
mutations onto an E2 three-dimensional structure so as to identify areas of amino
acid conservation and the effect of mutations on folding of the E2 protein.

11. Use according to Claim 10 in rationalised antiviral drug design.

12. An in vitro method for identifying and/or selecting a candidate therapeutic
agent, the method comprising determining interaction of an E2 N-terminal module
(E2NT) dimer in a sample by contacting said sample with said candidate therapeutic
agent and measuring DNA loop formation in E2.

13. Use of the method according to Claim 12 in identifying and/or selecting an
antiviral candidate therapeutic agent.

14. Use according to Claim 13 wherein identification/selection of the candidate
therapeutic agent depends on its ability to interfere with or block interactions of
E2NT so as to interfere or block viral and/or cellular transcription factors.

15. Use of an E2NT dimerisation inhibitor for the preparation of a medicament
for treatment of conditions that arise as a result of HPV infection.

16. Use according to Claim 15 for the treatment of warts, proliferative skin
lesions and/or cervical cancer.

17. A method of monitoring the efficacy of an antiviral therapy in a patient
receiving a medicament for the treatment of an HPV infection comprising taking a
sample from said patient and measuring E2NT interactions and/or DNA loop
formation.

18. Use of a dimerisation surface of a crystallised molecular complex of an E2
N-terminal module (E2NT) dimer protein or homologue thereof according to any one
of Claims 1-9 as a target site for interaction with putative antiviral agents and/or for
measuring efficacy of said agents.

19. A method for identifying and/or selecting a candidate therapeutic agent,
comprising applying rationalised drug design to a crystal structure obtainable by
crystallising E2NT, cryogenically freezing the crystals and generating the crystal
structure using X-ray diffraction.

20. A method of claim 19, wherein the method by which the E2NT crystal
structure is obtainable comprises crystallisation using hanging-drop vapour diffusion.

21. A method of claim 19 or claim 20 wherein the method by which E2NT crystal
structure is obtainable comprises X-ray diffraction using uranium acetate and gold
cyanide E2NT derivatives and refining with data extending to 1.9 Å spacing.

22. A method of any of claims 19 to 21, wherein the crystal structure comprises
the portions of amino acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145, Pro106,
Lys111, Phe168, Trp134, Trp33 and Leu94.

23. A method of any of claims 19 to 22, wherein the rationalised drug design
comprises designing drugs which interact with the dimerisation surface of E2NT.

24. A computer for producing a three-dimensional representation of a molecule or
molecular complex, wherein said molecule or molecular complex comprises or a
three-dimensional representation of a homologue of said molecule or molecular
complex, wherein said homologue comprises a binding pocket that has a root mean
square deviation from the backbone atoms of said amino acids of not more than
1.5 Å, wherein said computer comprises:

(a) a machine-readable data storage medium comprising a data storage material
encoded with machine-readable data, wherein said data comprises the structure
coordinates of E2NT amino acids Ile82, Glu90, Trp92, Lys112, Tyr138, Val145,
Pro106, Lys111, Phe168, Trp134, Trp33 and Leu94 according to Table 3;

(b) a working memory for storing instructions for processing said machine-readable
data;

(c) a central-processing unit coupled to said working memory and to said
machine-readable data storage medium for processing said machine readable data into said
three-dimensional representation; and

(d) a display coupled to said central-processing unit for displaying said
three-dimensional representation.

25. The computer according to claim 24, wherein said three-dimensional
representation is of a molecule or molecular complex is defined by the set of
structure coordinates according to Table 3, or wherein said three-dimensional
representation is of a homologue of said molecule or molecular complex, said
homologue having a root mean square deviation from the backbone atoms of said
amino acids of not more than 1.5 Å.

26. A computer for determining at least a portion of the structure coordinates
corresponding to an X-ray diffraction pattern of a molecule or molecular complex,
wherein said computer comprises:

(a) a machine-readable data storage medium comprising a data storage material
encoded with machine-readable data, wherein said data comprises at least a portion
of the structural coordinates according to Table 3;

(b) a machine-readable data storage medium comprising a data storage material
encoded with machine-readable data, wherein said data comprises an X-ray
diffraction pattern of said molecule or molecular complex;

(c) a working memory for storing instructions for processing said machine-readable
data of (a) and (b);

(d) a central-processing unit coupled to said working memory and to said
machine-readable data storage medium of (a) and (b) for performing a Fourier transform of the
machine readable data of (a) and for processing said machine readable data of (b)
into structure coordinates; and

(e) a display coupled to said central-processing unit for displaying said structure
coordinates of said molecule or molecular complex.

27. A crystallised molecule or molecular complex comprising a dimerisation
surface defined by structure coordinates of E2NT amino acids Ile82, Glu90, Trp92,
Lys112, Tyr138, Val145, Pro106, Lys111, Phe168, Trp134, Trp33 and Leu94
according to Table 3 or a homologue of said molecule or molecular complex, wherein
said homologue comprises a binding pocket that has a root mean square deviation
from the backbone atoms of said amino acids of not more than 1.5 Å.

28. The crystallized molecule or molecular complex according to claim 27,
wherein said molecule or molecular complex is defined by the set of structure
coordinates according to Table 3, or a homologue thereof, wherein said homologue
has a root mean square deviation from the backbone atoms of said amino acids of not
more than 1.5 Å.

29. A machine-readable data storage medium, comprising a data storage material
encoded with machine readable data which, when using a machine programmed with
instructions for using said data, is capable of displaying a graphical
three-dimensional representation of a molecule or molecular complex comprising a
dimerisation surface defined by structure coordinates of E2NT amino acids Ile82,
Glu90, Trp92, Lys112, Tyr138, Val145, Pro106, Lys111, Phe168, Trp134, Trp33 and
Leu94 according to Table 3, or a homologue of said molecule or molecular complex,
wherein said homologue comprises a binding pocket that has a root mean square
deviation from the backbone atoms of said amino acids of not more than 1.5 Å.

30. The machine-readable data storage medium according to claim 7, wherein
said molecule or molecular complex is defined by the set of structure coordinates
according to Table 3, or a homologue of said molecule or molecular complex, said
homologue having a root mean square deviation from the backbone atoms of said
amino acids of not more than 1.5 Å.

31. A machine-readable data storage medium comprising a data storage material
encoded with a first set of machine readable data which, when combined with a
second set of machine readable data, using a machine programmed with instructions
for using said first set of data and said second set of data, can determine at least a
portion of the structure coordinates corresponding to the second set of machine
readable data, wherein: said first set of data comprises a Fourier transform of at least
a portion of the structural coordinates according to Table 3; and said second set of
data comprises an X-ray diffraction pattern of a molecule or molecular complex.

32. A method for evaluating the ability of a chemical entity to associate with a
molecule or molecular complex according to claim 27 or claim 28 comprising the
steps of:

a. employing computational means to perform a fitting operation between the
chemical entity and a dimerisation surface of the molecule or molecular complex;
and

b. analysing the results of said fitting operation to quantify the association
between the chemical entity and the dimerisation surface.

33. A drug or therapeutic agent identified, assessed or selected using a
crystallised molecular complex of an E2NT protein or its crystal structure or using a
complex of any of claims 1 to 9, a method of claim 12, a use of any of claims 13,
claim 14 or 18, a method of any of claims 20 to 24 or 32 or a product of any of claims
25 to 31.